week 6: integers and other monsters
ordered categories
Ordered categorical outcomes
Ordered categories are variables that are discrete, like a count, except that the values merely indicate different ordered levels.1 Most Likert scale questions are ordered categories, even though we pretend they’re continuous (see polychoric correlations).
If we want to model an outcome as an ordered categorical variable, what we’re really asking is: how does moving along an associated predictor variable move predictions in the outcome progressively through the categories, in sequence? We must therefore model our outcomes in the right order.
To do so, we’ll use the principles of the generalized linear model discussed last week. That is, we use a link function – the CUMULATIVE LINK FUNCTION – to model the probability that the response takes a given value or any smaller value.
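To build intuition, here is a small base-R sketch of what the cumulative link does. The category probabilities below are made up for illustration (they are not the Trolley data): we convert them to cumulative probabilities, then take the log-odds of each to get the cut points.

```r
# Made-up probabilities for a 6-category outcome (illustration only)
p <- c(0.10, 0.15, 0.20, 0.25, 0.20, 0.10)  # Pr(Y = k) for k = 1..6
cum_p <- cumsum(p)                          # Pr(Y <= k); the last entry is 1
logit <- function(x) log(x / (1 - x))

# Cut points: log-odds of each cumulative probability except the last
# (logit(1) would be infinite, so the top category needs no cut point)
alpha <- logit(cum_p[-length(cum_p)])
round(alpha, 2)
```

Note the cut points are strictly increasing, which is what preserves the ordering of the categories.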
example: the trolley problem
data(Trolley, package = "rethinking")
trolley <- Trolley
head(trolley)
case response order id age male edu action intention contact
1 cfaqu 4 2 96;434 14 0 Middle School 0 0 1
2 cfbur 3 31 96;434 14 0 Middle School 0 0 1
3 cfrub 4 16 96;434 14 0 Middle School 0 0 1
4 cibox 3 32 96;434 14 0 Middle School 0 1 1
5 cibur 3 4 96;434 14 0 Middle School 0 1 1
6 cispe 3 9 96;434 14 0 Middle School 0 1 1
story action2
1 aqu 1
2 bur 1
3 rub 1
4 box 1
5 bur 1
6 spe 1
trolley %>%
  ggplot(aes(x = response)) +
  geom_histogram()
trolley %>%
  count(response) %>%
  mutate(pr_k = n / nrow(trolley),
         cum_pr_k = cumsum(pr_k)) %>%
  ggplot(aes(x = response, y = cum_pr_k)) +
  geom_point(size = 3, color = "#1c5253") +
  geom_line(color = "#1c5253", alpha = .3) +
  labs(x = "response",
       y = "cumulative probability")
# primary data
trolley_plot <-
  trolley %>%
  count(response) %>%
  mutate(pr_k = n / nrow(trolley),
         cum_pr_k = cumsum(n / nrow(trolley))) %>%
  mutate(discrete_probability = ifelse(response == 1, cum_pr_k, cum_pr_k - pr_k))

# annotation
text <-
  tibble(text = 1:7,
         response = seq(from = 1.25, to = 7.25, by = 1),
         cum_pr_k = trolley_plot$cum_pr_k - .065)

trolley_plot %>%
  ggplot(aes(x = response, y = cum_pr_k,
             color = cum_pr_k, fill = cum_pr_k)) +
  geom_line() +
  geom_point(shape = 21, colour = "grey92",
             size = 2.5, stroke = 1) +
  geom_linerange(aes(ymin = 0, ymax = cum_pr_k),
                 alpha = 1/2) +
  geom_linerange(aes(x = response + .025,
                     ymin = ifelse(response == 1, 0, discrete_probability),
                     ymax = cum_pr_k),
                 color = "black") +
  # number annotation
  geom_text(data = text,
            aes(label = text),
            size = 4) +
  scale_x_continuous(breaks = 1:7) +
  scale_y_continuous("cumulative proportion", breaks = c(0, .5, 1),
                     limits = c(0, 1.1)) +
  theme(axis.ticks = element_blank(),
        axis.title.y = element_text(angle = 90),
        legend.position = "none")
On to the modeling. We’ll start with an intercept-only model.
\[\begin{align*}
R_i &\sim \text{Categorical}(p) \\
\text{logit}(p_k) &= \alpha_k - \phi \\
\phi &= 0 \\
\alpha_k &\sim \text{Normal}(0, 1.5)
\end{align*}\]
Remember, \(\phi\) doesn’t have an intercept because there’s an \(\alpha_k\) for each cut point.
m1 <- brm(
  data = trolley,
  family = cumulative,
  response ~ 1,
  prior = c(prior(normal(0, 1.5), class = Intercept)),
  iter = 2000, warmup = 1000, cores = 4, chains = 4,
  file = here("files/models/62.1")
)
\(\alpha_k\) denotes the \(K - 1\) cut points or thresholds, where \(K\) is the number of response categories.
\(\phi\) is a stand-in for a potential linear model.
Family: cumulative
Links: mu = logit; disc = identity
Formula: response ~ 1
Data: trolley (Number of observations: 9930)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1] -1.92 0.03 -1.98 -1.86 1.00 2725 3116
Intercept[2] -1.27 0.02 -1.32 -1.22 1.00 3960 3485
Intercept[3] -0.72 0.02 -0.76 -0.68 1.00 4323 3760
Intercept[4] 0.25 0.02 0.21 0.29 1.00 4822 3486
Intercept[5] 0.89 0.02 0.85 0.93 1.00 4364 3378
Intercept[6] 1.77 0.03 1.71 1.83 1.00 4658 3543
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
disc 1.00 0.00 1.00 1.00 NA NA NA
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
brms implements the cumulative-logit model as

P(Y ≤ k) = F(disc * (thres[k] - mu))

Where:

thres[k] – The threshold parameters (often called “cutpoints,” or \(\alpha_k\) in the notation above). They mark the boundaries between adjacent categories on the latent continuous scale, and appear as the “Intercept[k]” parameters in the model output.

mu – The linear predictor, which can include fixed effects, random effects, etc. It represents the location of an observation on the latent scale.

disc – A discrimination parameter (sometimes called a scale parameter). It controls how quickly the probabilities change as you move along the latent scale; higher values mean sharper distinctions between categories.

In lecture, we formulate the cumulative logit as

logit(p_k) = α_k - φ_i

Where:

α_k corresponds to thres[k] in brms
φ_i corresponds to mu in brms
The discrimination parameter is implicitly set to 1

The negative sign in front of φ_i in our formulation corresponds to how brms parameterizes the model with (thres[k] - mu) rather than (mu - thres[k]).
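To see the correspondence concretely, here is a base-R sketch that plugs the posterior-mean Intercept[k] values from the m1 summary above into the brms parameterization (with disc = 1) and recovers per-category probabilities. This is a plug-in illustration, not a substitute for working with the full posterior.

```r
inv_logit <- function(x) 1 / (1 + exp(-x))
thres <- c(-1.92, -1.27, -0.72, 0.25, 0.89, 1.77)  # Intercept[1..6] from m1

# P(Y <= k) = F(disc * (thres[k] - mu)), with F the inverse logit
cum_pr <- function(mu, disc = 1) inv_logit(disc * (thres - mu))

# Per-category probabilities are differences of adjacent cumulative probabilities
pr_k <- function(mu) diff(c(0, cum_pr(mu), 1))

round(pr_k(0), 3)          # with mu = 0, close to the observed response proportions
all(cum_pr(1) < cum_pr(0)) # TRUE: raising mu shifts mass toward higher categories
```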
The thresholds are in the logit-metric. Let’s convert these back to probabilities.
gather_draws(m1, `b_.*`, regex = TRUE) %>%
  mutate(.value = inv_logit_scaled(.value)) %>%
  mean_qi()
# A tibble: 6 × 7
.variable .value .lower .upper .width .point .interval
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 b_Intercept[1] 0.128 0.122 0.135 0.95 mean qi
2 b_Intercept[2] 0.220 0.212 0.228 0.95 mean qi
3 b_Intercept[3] 0.328 0.319 0.337 0.95 mean qi
4 b_Intercept[4] 0.562 0.552 0.571 0.95 mean qi
5 b_Intercept[5] 0.709 0.700 0.718 0.95 mean qi
6 b_Intercept[6] 0.854 0.847 0.861 0.95 mean qi
exercise
Create a plot showing the posterior predictive distribution for the intercept-only model (m1). Can you visualize both the posterior and the observed data?
solution
predicted_draws(m1, newdata = data.frame(1)) %>%
  count(.prediction) %>%
  mutate(pr_k = n / sum(n)) %>%
  ggplot(aes(x = .prediction, y = pr_k)) +
  geom_col(fill = "#1c5253", alpha = 0.7) +
  geom_point(data = trolley_plot, aes(x = response),
             color = "#3d405b", size = 2) +
  labs(x = "response",
       y = "probability",
       title = "Posterior Predictive Distribution")
adding predictors
Now let’s add the features of the story – contact, action, and intention – as predictors in our model. Let’s expand our mathematical model. Here’s our original:
\[\begin{align*}
R_i &\sim \text{Categorical}(p) \\
\text{logit}(p_k) &= \alpha_k - \phi_i \\
\phi_i &= 0 \\
\alpha_k &\sim \text{Normal}(0, 1.5)
\end{align*}\]
McElreath proposes the following linear model for \(\phi_i\):
\[
\phi_i = \beta_1\text{action}_i + \beta_2\text{contact}_i + \beta_3\text{intention}_i
\]
m2 <- brm(
  data = trolley,
  family = cumulative,
  response ~ 1 + action + contact + intention,
  prior = c(prior(normal(0, 1.5), class = Intercept),
            prior(normal(0, 0.5), class = b)),
  iter = 2000, warmup = 1000, cores = 4, chains = 4,
  file = here("files/models/62.2")
)
m2
Family: cumulative
Links: mu = logit; disc = identity
Formula: response ~ 1 + action + contact + intention
Data: trolley (Number of observations: 9930)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1] -2.83 0.05 -2.92 -2.74 1.00 2960 2654
Intercept[2] -2.15 0.04 -2.23 -2.06 1.00 3324 2745
Intercept[3] -1.56 0.04 -1.64 -1.49 1.00 2997 2965
Intercept[4] -0.54 0.04 -0.61 -0.47 1.00 3180 2764
Intercept[5] 0.12 0.04 0.05 0.20 1.00 3373 3358
Intercept[6] 1.03 0.04 0.95 1.11 1.00 3586 3183
action -0.70 0.04 -0.78 -0.62 1.00 3744 3404
contact -0.95 0.05 -1.05 -0.85 1.00 3038 2764
intention -0.72 0.04 -0.79 -0.64 1.00 4178 3090
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
disc 1.00 0.00 1.00 1.00 NA NA NA
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
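To interpret these slopes, here is a base-R sketch that plugs the posterior means above into the model and compares predicted response probabilities for a baseline story versus one featuring action and intention (a plug-in illustration only, ignoring posterior uncertainty):

```r
inv_logit <- function(x) 1 / (1 + exp(-x))
a <- c(-2.83, -2.15, -1.56, -0.54, 0.12, 1.03)          # Intercept[1..6], m2
b <- c(action = -0.70, contact = -0.95, intention = -0.72)

# logit(p_k) = alpha_k - phi  ->  per-category probabilities
pr_k <- function(phi) diff(c(0, inv_logit(a - phi), 1))

baseline <- pr_k(0)                                     # story with no features
story    <- pr_k(b[["action"]] + b[["intention"]])      # action = 1, intention = 1

# A negative phi raises every cumulative probability,
# so the expected response is lower for the action + intention story
sum(1:7 * story) < sum(1:7 * baseline)  # TRUE
```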
gather_draws(m2, `b_.*`, regex = TRUE) %>%
  filter(!str_detect(.variable, "[0-9]")) %>%
  mutate(.variable = str_remove(.variable, "b_")) %>%
  ggplot(aes(x = .value, y = .variable)) +
  geom_vline(aes(xintercept = 0), linetype = "dashed", alpha = .3) +
  stat_dist_halfeye(fill = "#5e8485") +
  scale_x_continuous("marginal posterior", breaks = -5:0 / 4) +
  scale_y_discrete(NULL)
nd <-
  trolley %>%
  distinct(action, contact, intention) %>%
  mutate(combination = str_c(action, contact, intention, sep = "_"))

f <- predicted_draws(m2, newdata = nd)
f
# A tibble: 24,000 × 9
# Groups: action, contact, intention, combination, .row [6]
action contact intention combination .row .chain .iteration .draw
<int> <int> <int> <chr> <int> <int> <int> <int>
1 0 1 0 0_1_0 1 NA NA 1
2 0 1 0 0_1_0 1 NA NA 2
3 0 1 0 0_1_0 1 NA NA 3
4 0 1 0 0_1_0 1 NA NA 4
5 0 1 0 0_1_0 1 NA NA 5
6 0 1 0 0_1_0 1 NA NA 6
7 0 1 0 0_1_0 1 NA NA 7
8 0 1 0 0_1_0 1 NA NA 8
9 0 1 0 0_1_0 1 NA NA 9
10 0 1 0 0_1_0 1 NA NA 10
# ℹ 23,990 more rows
# ℹ 1 more variable: .prediction <fct>
f %>%
  as.data.frame() %>%
  mutate(
    across(c(action, contact, intention),
           as.character),
    action = str_c("action = ", action),
    contact = str_c("contact = ", contact)) %>%
  count(action, contact, intention, .prediction) %>%
  ggplot(aes(x = .prediction, y = n, color = intention)) +
  geom_point() +
  geom_line(aes(group = intention)) +
  facet_grid(~ action + contact) +
  labs(x = "response",
       y = "count")
exercise
Create a new model that models the interaction between intention and the other variables. Then:
Compare the fits of these models using PSIS or WAIC.
Plot the posterior predictive distribution. How does it differ from our previous model?
solution
m2_complex <- brm(
  data = trolley,
  family = cumulative,
  response ~ 1 + action + contact + intention +
    action:intention + contact:intention,
  prior = c(prior(normal(0, 1.5), class = Intercept),
            prior(normal(0, 0.5), class = b)),
  iter = 2000, warmup = 1000, cores = 4, chains = 4,
  file = here("files/models/62.2com")
)
solution: compare
m2 <- add_criterion(m2, "loo")
m2_complex <- add_criterion(m2_complex, "loo")
loo_compare(m2, m2_complex, criterion = "loo") %>%
  print(simplify = F)
elpd_diff se_diff elpd_loo se_elpd_loo p_loo se_p_loo looic
m2_complex 0.0 0.0 -18464.7 40.4 11.0 0.1 36929.3
m2 -80.3 12.5 -18545.0 37.9 9.0 0.0 37090.0
se_looic
m2_complex 80.8
m2 75.8
m2 <- add_criterion(m2, "waic")
m2_complex <- add_criterion(m2_complex, "waic")
loo_compare(m2, m2_complex, criterion = "waic") %>%
  print(simplify = F)
elpd_diff se_diff elpd_waic se_elpd_waic p_waic se_p_waic
m2_complex 0.0 0.0 -18464.6 40.4 11.0 0.1
m2 -80.3 12.5 -18545.0 37.9 9.0 0.0
waic se_waic
m2_complex 36929.3 80.8
m2 37090.0 75.8
solution: ppd
predicted_draws(object = m2_complex, newdata = nd) %>%
  mutate(
    intention = as.character(intention),
    action = str_c("action = ", action),
    contact = str_c("contact = ", contact)) %>%
  count(action, contact, intention, .prediction) %>%
  ggplot(aes(x = .prediction, y = n, color = intention)) +
  geom_point() +
  geom_line(aes(group = intention)) +
  facet_grid(~ action + contact) +
  labs(x = "response",
       y = "count")
solution: ppd
p1 <- predicted_draws(object = m2, newdata = nd) %>%
  mutate(model = "simple")
p2 <- predicted_draws(object = m2_complex, newdata = nd) %>%
  mutate(model = "complex")

p1 %>% full_join(p2) %>%
  mutate(
    intention = as.character(intention),
    action = str_c("action = ", action),
    contact = str_c("contact = ", contact)) %>%
  count(model, action, contact, intention, .prediction) %>%
  ggplot(aes(x = .prediction, y = n,
             color = interaction(intention, model,
                                 sep = "-", lex.order = TRUE))) +
  geom_point() +
  geom_line(aes(group = interaction(intention, model, sep = "-", lex.order = TRUE))) +
  facet_grid(~ action + contact) +
  labs(x = "response",
       y = "count",
       color = "intention+model")
Ordered categorical predictors
The levels of the edu variable are not stored in a meaningful order:

edu
1 Middle School
2 Bachelor's Degree
3 Some College
4 Master's Degree
5 High School Graduate
6 Graduate Degree
7 Some High School
8 Elementary School

So we first recode them into integers that respect the actual ordering:
trolley <-
  trolley %>%
  mutate(edu_new =
           recode(edu,
                  "Elementary School" = 1,
                  "Middle School" = 2,
                  "Some High School" = 3,
                  "High School Graduate" = 4,
                  "Some College" = 5,
                  "Bachelor's Degree" = 6,
                  "Master's Degree" = 7,
                  "Graduate Degree" = 8) %>%
           as.integer())
To incorporate this variable as a MONOTONIC CATEGORICAL PREDICTOR, we’ll set up the notation so that each step up in value comes with its own incremental, or marginal, effect on the outcome.
\[
\phi_i = \sum_{j=1}^{7} \delta_j
\]

This is \(\phi\) for a respondent at the highest education level; we only need 7 increments because the first of the 8 levels is absorbed into the intercept for \(\phi\). So our full formula for this parameter is:

\[
\phi_i = \beta_1\text{action}_i + \beta_2\text{contact}_i + \beta_3\text{intention}_i + \beta_E\sum_{j=0}^{E_i-1} \delta_j
\]

where \(\beta_E\) is the maximum effect of education, and the \(\delta_j\) (with \(\delta_0 = 0\)) are the proportions of that effect contributed by each step.
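The education term above can be sketched in base R with a hypothetical simplex \(\boldsymbol{\delta}\) (made-up values for illustration): the monotonic effect at each level is just \(\beta_E\) times the cumulative sum of the increments, so it moves monotonically from 0 at the baseline level to \(\beta_E\) at the highest level.

```r
bE    <- -0.30                                         # hypothetical maximum effect
delta <- c(0.20, 0.10, 0.15, 0.25, 0.05, 0.10, 0.15)   # hypothetical simplex, sums to 1

# Effect of education at each of the 8 levels; delta_0 = 0 absorbs the baseline
mo_effect <- bE * cumsum(c(0, delta))
round(mo_effect, 3)
```

The constraint that the \(\delta_j\) sum to 1 is what makes each step a share of the total effect rather than a free parameter.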
\[\begin{align*}
\text{response}_i &\sim \text{Categorical}(\mathbf{p}) \\
\text{logit}(p_k) &= \alpha_k - \phi_i \\
\phi_i &= \beta_1\text{action}_i + \beta_2\text{contact}_i + \beta_3\text{intention}_i + \beta_4 \, \text{mo}(\text{edu\_new}_i, \boldsymbol{\delta}) \\
\alpha_k &\sim \text{Normal}(0, 1.5) \\
\beta_1, \ldots, \beta_3 &\sim \text{Normal}(0, 1) \\
\beta_4 &\sim \text{Normal}(0, 0.143) \\
\boldsymbol{\delta} &\sim \text{Dirichlet}(2, 2, 2, 2, 2, 2, 2)
\end{align*}\]
m3 <-
  brm(data = trolley,
      family = cumulative,
      response ~ 1 + action + contact + intention + mo(edu_new),  # note the `mo()` syntax
      prior = c(prior(normal(0, 1.5), class = Intercept),
                prior(normal(0, 1), class = b),
                # note the new kinds of prior statements
                prior(normal(0, 0.143), class = b, coef = moedu_new),
                prior(dirichlet(2, 2, 2, 2, 2, 2, 2), class = simo, coef = moedu_new1)),
      iter = 2000, warmup = 1000, cores = 4, chains = 4,
      seed = 12,
      file = here("files/models/62.3"))
Warning: this will probably take 20 minutes or so to run on your laptop. You can download my model file here.
Family: cumulative
Links: mu = logit; disc = identity
Formula: response ~ 1 + action + contact + intention + mo(edu_new)
Data: trolley (Number of observations: 9930)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1] -3.13 0.17 -3.53 -2.84 1.00 1707 2012
Intercept[2] -2.45 0.17 -2.83 -2.16 1.00 1702 1934
Intercept[3] -1.87 0.17 -2.25 -1.58 1.00 1708 1804
Intercept[4] -0.85 0.17 -1.23 -0.56 1.00 1731 1956
Intercept[5] -0.18 0.17 -0.56 0.11 1.00 1718 1925
Intercept[6] 0.73 0.17 0.35 1.01 1.00 1737 1988
action -0.71 0.04 -0.78 -0.63 1.00 4604 3182
contact -0.96 0.05 -1.06 -0.86 1.00 4521 2761
intention -0.72 0.04 -0.79 -0.65 1.00 4514 2992
moedu_new -0.05 0.03 -0.11 -0.00 1.00 1705 1866
Monotonic Simplex Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
moedu_new1[1] 0.26 0.15 0.04 0.59 1.00 2292 3104
moedu_new1[2] 0.14 0.09 0.02 0.35 1.00 4405 2501
moedu_new1[3] 0.19 0.11 0.03 0.44 1.00 4005 2613
moedu_new1[4] 0.16 0.09 0.03 0.39 1.00 3636 2564
moedu_new1[5] 0.04 0.05 0.00 0.15 1.00 2606 1894
moedu_new1[6] 0.09 0.06 0.01 0.25 1.00 3747 2484
moedu_new1[7] 0.12 0.07 0.02 0.29 1.00 3829 2596
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
disc 1.00 0.00 1.00 1.00 NA NA NA
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Let’s see the effect of education after controlling for story parameters.
nd <- distinct(
  trolley,
  action, contact, intention, edu_new
)

predicted_draws(m3, nd) %>%
  ungroup() %>%
  count(edu_new, .prediction) %>%
  with_groups(edu_new, mutate, total = sum(n)) %>%
  mutate(pk = n / total,
         .prediction = as.numeric(.prediction),
         edu_new = as.factor(edu_new)) %>%
  ggplot(aes(x = .prediction, y = pk, color = edu_new)) +
  geom_point() +
  geom_line() +
  labs(x = "response",
       y = "probability") +
  theme(legend.position = "top")
exercise
Create a new model that includes education as a regular categorical predictor (not monotonic) and compare it to m3. Your tasks:

1. Fit the model using the original edu variable (not edu_new).
2. Create a plot of the posterior predictive distribution from the new model.
3. (Optional) Create a side-by-side visual comparison of the education effects.
4. (Optional) Interpret whether the monotonic constraint appears to improve the model fit.
solution: fit model
# Fit model with regular categorical education
m3_cat <- brm (
data = trolley,
family = cumulative,
response ~ 1 + action + contact + intention + edu,
prior = c (prior (normal (0 , 1.5 ), class = Intercept),
prior (normal (0 , 1 ), class = b)),
iter = 2000 , warmup = 1000 , cores = 4 , chains = 4 ,
file= here ("files/models/62.3cat" )
)
m3_cat
Family: cumulative
Links: mu = logit; disc = identity
Formula: response ~ 1 + action + contact + intention + edu
Data: trolley (Number of observations: 9930)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
Intercept[1] -2.96 0.05 -3.06 -2.86 1.00 2945
Intercept[2] -2.28 0.05 -2.37 -2.18 1.00 3064
Intercept[3] -1.69 0.05 -1.78 -1.60 1.00 3071
Intercept[4] -0.66 0.04 -0.75 -0.58 1.00 3178
Intercept[5] 0.01 0.04 -0.07 0.10 1.00 3191
Intercept[6] 0.92 0.05 0.83 1.01 1.00 3731
action -0.71 0.04 -0.79 -0.63 1.00 4290
contact -0.96 0.05 -1.07 -0.87 1.00 3938
intention -0.72 0.04 -0.79 -0.65 1.00 4548
eduElementarySchool 0.91 0.20 0.51 1.30 1.00 4417
eduGraduateDegree -0.17 0.06 -0.30 -0.06 1.00 3902
eduHighSchoolGraduate -0.06 0.07 -0.19 0.07 1.00 4060
eduMastersDegree -0.11 0.06 -0.22 -0.00 1.00 3674
eduMiddleSchool -0.25 0.15 -0.54 0.04 1.00 4061
eduSomeCollege -0.31 0.05 -0.40 -0.22 1.00 3476
eduSomeHighSchool 0.13 0.09 -0.05 0.31 1.00 4017
Tail_ESS
Intercept[1] 3032
Intercept[2] 3242
Intercept[3] 3462
Intercept[4] 3403
Intercept[5] 3188
Intercept[6] 3322
action 3188
contact 3308
intention 3197
eduElementarySchool 2639
eduGraduateDegree 2979
eduHighSchoolGraduate 2920
eduMastersDegree 3003
eduMiddleSchool 2889
eduSomeCollege 3245
eduSomeHighSchool 3187
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
disc 1.00 0.00 1.00 1.00 NA NA NA
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
solution: ppd
nd <- distinct(
  trolley,
  action, contact, intention, edu
)

predicted_draws(m3_cat, nd) %>%
  ungroup() %>%
  count(edu, .prediction) %>%
  mutate(
    edu = factor(edu,
                 levels = c(
                   "Elementary School",
                   "Middle School",
                   "Some High School",
                   "High School Graduate",
                   "Some College",
                   "Bachelor's Degree",
                   "Master's Degree",
                   "Graduate Degree"))) %>%
  with_groups(edu, mutate, total = sum(n)) %>%
  mutate(pk = n / total,
         .prediction = as.numeric(.prediction)) %>%
  ggplot(aes(x = .prediction, y = pk, color = edu)) +
  geom_point() +
  geom_line() +
  labs(x = "response",
       y = "probability") +
  theme(legend.position = "top")
solution: side-by-side
p1 <- predicted_draws(m3, newdata = distinct(trolley, action, contact, intention, edu_new)) %>%
  mutate(model = "ordered",
         edu = factor(edu_new,
                      levels = 1:8,
                      labels = c(
                        "Elementary School",
                        "Middle School",
                        "Some High School",
                        "High School Graduate",
                        "Some College",
                        "Bachelor's Degree",
                        "Master's Degree",
                        "Graduate Degree")))

p2 <- predicted_draws(m3_cat, newdata = distinct(trolley, action, contact, intention, edu)) %>%
  mutate(model = "categorical")

full_join(p1, p2) %>%
  ungroup() %>%
  count(model, edu, .prediction) %>%
  with_groups(c(model, edu), mutate, total = sum(n)) %>%
  mutate(pk = n / total,
         .prediction = as.numeric(.prediction)) %>%
  ggplot(aes(x = .prediction, y = pk, color = edu)) +
  geom_point(aes(shape = model)) +
  geom_line(aes(linetype = model)) +
  labs(x = "response", y = "probability") +
  guides(color = "none") +
  facet_wrap(~ edu) +
  theme(legend.position = "top")
solution: compare fit
m3 <- add_criterion(m3, "loo")
m3_cat <- add_criterion(m3_cat, "loo")
loo_compare(m3, m3_cat, criterion = "loo") %>%
  print(simplify = F)
elpd_diff se_diff elpd_loo se_elpd_loo p_loo se_p_loo looic
m3_cat 0.0 0.0 -18510.5 38.9 15.3 0.2 37021.0
m3 -30.2 7.7 -18540.7 38.1 11.2 0.1 37081.4
se_looic
m3_cat 77.7
m3 76.2
m3 <- add_criterion(m3, "waic")
m3_cat <- add_criterion(m3_cat, "waic")
loo_compare(m3, m3_cat, criterion = "waic") %>%
  print(simplify = F)
elpd_diff se_diff elpd_waic se_elpd_waic p_waic se_p_waic waic
m3_cat 0.0 0.0 -18510.5 38.9 15.2 0.2 37021.0
m3 -30.2 7.7 -18540.7 38.1 11.2 0.1 37081.4
se_waic
m3_cat 77.7
m3 76.2